Encoding

Text can be encoded in several ways. Most older text files use an encoding called ANSI, which can represent only a limited number of different characters but is often sufficient to display all the text. Unicode encodings support a far larger set of characters, so a single file can contain many languages at once, at the cost of a larger file size. Notepad++ tries to detect the encoding automatically when it opens a file, and lets you change it while editing. To change only the displayed encoding (without modifying the actual text), select one of the Format->Encode in options from the Format menu. To convert the text to a certain encoding, select one of the Format->Convert to options from the Format menu.
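The difference between changing the displayed encoding and converting the text can be sketched in Python. This is only an illustration of the idea, not how Notepad++ is implemented; the code page cp1252 is an assumption standing in for "ANSI", which in practice depends on the system code page.

```python
# The word "café" as it would be stored on disk in UTF-8.
raw = "café".encode("utf-8")              # b'caf\xc3\xa9', 5 bytes

# "Encode in": the bytes stay the same, only their interpretation changes.
shown_as_utf8 = raw.decode("utf-8")       # 'café'
shown_as_ansi = raw.decode("cp1252")      # 'cafÃ©' (same bytes, wrong reading)

# "Convert to": the text stays the same, the bytes are rewritten.
converted = shown_as_utf8.encode("cp1252")  # b'caf\xe9', 4 bytes

print(shown_as_utf8, shown_as_ansi, raw, converted)
```

Picking the wrong displayed encoding garbles the text on screen but leaves the stored bytes untouched, whereas a conversion rewrites the bytes while the visible text stays the same.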

It can happen that a file is saved with one encoding but detected as another when it is reopened in Notepad++. This is a technical limitation: for some content, different encodings produce exactly the same bytes, so the file itself gives no indication of which one was used. It is most noticeable when the file is saved without a BOM (Byte Order Mark), the special marker that indicates the encoding used.
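A small Python sketch shows why detection can fail. It assumes cp1252 as the ANSI code page, and the helper name sniff_bom is hypothetical; it is not Notepad++'s detection routine and it ignores UTF-32 for brevity.

```python
import codecs

# Text containing only basic characters encodes to identical bytes
# in ANSI (cp1252 assumed here) and in UTF-8.
text = "plain ASCII text"
ansi_bytes = text.encode("cp1252")
utf8_bytes = text.encode("utf-8")
same_bytes = ansi_bytes == utf8_bytes     # True: nothing to tell them apart

def sniff_bom(data: bytes):
    """Return an encoding name if the data starts with a known BOM, else None."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return None                           # no BOM: the encoding must be guessed

print(same_bytes)
print(sniff_bom(utf8_bytes))                    # no BOM present
print(sniff_bom(codecs.BOM_UTF8 + utf8_bytes))  # BOM present
```

Without a BOM, the two byte strings are indistinguishable, so any detector has to fall back on guessing.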

Notepad++ offers the following encoding schemes:

ANSI
Older encoding. Smallest file size, but error-prone because the result depends on which of the various code pages is in use.
UTF-8
Unicode encoding. Basic Western (ASCII) characters take one byte each; other characters take two to four bytes. A three-byte BOM is added on save.
UTF-8 without BOM
Like UTF-8, but no BOM is added. Saves three bytes, but makes encoding detection harder.
UTF-16 Little Endian
Most characters are two bytes in size (less common ones take four); the bytes of each pair are in Little Endian order. A two-byte BOM is added on save.
UTF-16 Big Endian
Most characters are two bytes in size (less common ones take four); the bytes of each pair are in Big Endian order. A two-byte BOM is added on save.
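To compare the file-size cost of these schemes, the following Python sketch encodes a short sample and reads the BOM lengths from Python's codecs module. The sample string is arbitrary, and cp1252 is again an assumption standing in for ANSI.

```python
import codecs

sample = "aé€"   # an ASCII letter, an accented letter, and the euro sign

# Number of bytes needed for the sample in each scheme (BOM not included).
sizes = {
    "ANSI (cp1252)": len(sample.encode("cp1252")),     # 1 + 1 + 1 = 3
    "UTF-8":         len(sample.encode("utf-8")),      # 1 + 2 + 3 = 6
    "UTF-16 LE":     len(sample.encode("utf-16-le")),  # 2 + 2 + 2 = 6
    "UTF-16 BE":     len(sample.encode("utf-16-be")),  # 2 + 2 + 2 = 6
}

# Length in bytes of the BOM that each Unicode scheme can prepend.
bom_lengths = {
    "UTF-8":     len(codecs.BOM_UTF8),
    "UTF-16 LE": len(codecs.BOM_UTF16_LE),
    "UTF-16 BE": len(codecs.BOM_UTF16_BE),
}

# A character outside the most common range needs four bytes in UTF-16.
emoji_size = len("😀".encode("utf-16-le"))

for name, size in sizes.items():
    print(f"{name}: {size} bytes")
print(bom_lengths, emoji_size)
```

The same three characters occupy three bytes in ANSI and six in either Unicode scheme, which is the file-size trade-off described above.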

In addition, since version 5.6, Notepad++ supports changing the character set used to display the text, just as you can in most web browsers. These encodings are available through the Character sets menu entry, which comes right after the Encode in ... family of items.

Note that, for HTML and XML files, Notepad++ attempts to detect the encoding when the file is opened, thus avoiding a number of errors that might otherwise not show up until the file is used on a server.
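How Notepad++ performs this detection is not described here. One plausible approach, sketched below in Python purely as an assumption, is to read the encoding that the file declares about itself in its XML declaration or HTML meta tag. The function name declared_encoding and the two patterns are hypothetical and deliberately simplified.

```python
import re

# Simplified patterns for the two common self-declarations (assumed forms).
XML_DECL = re.compile(
    rb'<\?xml[^>]*\bencoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']')
HTML_META = re.compile(
    rb'<meta[^>]*\bcharset\s*=\s*["\']?([A-Za-z0-9._-]+)', re.IGNORECASE)

def declared_encoding(data: bytes):
    """Return the lower-cased encoding name declared in the file head, or None."""
    head = data[:1024]                    # declarations sit near the start
    for pattern in (XML_DECL, HTML_META):
        match = pattern.search(head)
        if match:
            return match.group(1).decode("ascii").lower()
    return None

print(declared_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><root/>'))
print(declared_encoding(b'<html><head><meta charset="utf-8"></head></html>'))
print(declared_encoding(b'<p>no declaration here</p>'))
```

A declaration found this way only states what the author intended; the actual bytes may still have been saved in a different encoding.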